SDA 3.5 Documentation for CASEStoDDL
NAME
CASEStoDDL - Create a DDL file from CASES (version 4)
DESCRIPTION
SDA programs can be used to document data collected by the CASES
system for computer-assisted interviewing. To use the SDA
programs, you must first generate a DDL file that describes the
specific data file produced by the CASES system. This document
summarizes the procedures necessary to create such a DDL file.
There are many options available to customize the content of a
DDL file produced from a CASES instrument. However, there are
default options which will usually produce satisfactory results,
at least as a starting point. The purpose of this document is to
illustrate how to run the procedures in a simple way, taking
advantage of the default options.
STEPS OF THE PROCESS
- Create a list of variables to output
- Create a list of cases to include in the data file
- Create the data file by running the CASES ‘output’ program
- Run the SDA ‘q4toddl’ program
- Add study-level information to the DDL file
- Check the DDL file
1. CREATE A LIST OF VARIABLES TO OUTPUT
Create a list of variables that you want to include in your data
file. The list of variables should have one variable name on
each line. Note that there are usually many variables used by
the interviewing system that you will not want to pass through to
the data file.
One way to obtain a list of variables is to run the CASES
‘layout’ program with the ‘-b’ (brief) and ‘-o’ (only variables)
options.
- The ‘-b’ option requests ‘brief’ output, with item names
alone, and without other information on each item.
- The ‘-o’ option requests that only items with a defined
width be listed. This means that the ‘nodata’ items are omitted.
The list produced by ‘layout’ can then be edited down to the
variables that are of substantive interest by deleting fills and
other non-input items.
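The editing step above can also be scripted. The following Python
sketch assumes that the brief ‘layout’ output has been saved as
plain text with one item name per line. The function name, the
‘fill’ prefix, and the sample item names are hypothetical and
serve only to illustrate the idea.

```python
def filter_variables(names, exclude_prefixes=(), exclude_names=()):
    """Return the names to keep, in their original order.

    Blank lines, repeated names, names listed in exclude_names, and
    names starting with any of exclude_prefixes are dropped.
    """
    excluded = set(exclude_names)
    prefixes = tuple(exclude_prefixes)
    kept = []
    seen = set()
    for raw in names:
        name = raw.strip()
        if not name or name in seen or name in excluded:
            continue
        if name.startswith(prefixes):
            continue
        seen.add(name)
        kept.append(name)
    return kept


# Hypothetical item names standing in for saved 'layout' output.
sample = ["age", "fill1", "sex", "", "tmpx", "age"]
print(filter_variables(sample, exclude_prefixes=("fill",), exclude_names=("tmpx",)))
```

Writing the returned names to a file, one per line, gives a list
in the form that the ‘output’ program expects.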
Save the final list of variables in a file named something like
‘myvars’, for input into the third stage of the process.
2. CREATE A LIST OF CASES TO INCLUDE IN THE DATA FILE
You will usually want to include only the completed cases in the
data file. In order to do this, you must prepare a list of the
cases to be output by the CASES system.
The CASES ‘caselist’ program produces lists of cases, according
to criteria that you specify. The precise criteria to use depend
on your treatment of completed cases -- whether they have been
run through the second-stage cleaning process, for example.
Typically, you would specify that the ‘caselist’ program produce
a list of all cases that are in one of the following stages:
‘middle’, ‘ready’, or ‘certified’.
Save that list of cases in a file named something like ‘idlist’,
for input into the next stage of the process.
3. CREATE THE DATA FILE BY RUNNING THE CASES OUTPUT PROGRAM
To create the data file that will be documented, run the CASES
‘output’ program, using the ‘-i = filename’ option. (Do NOT run
the CASES ‘output’ program without the ‘-i’ option. See below
for an explanation of why not.) The ‘-i’ option is used to
specify the name of a file containing a list of the variables
that you want to include in the data file. This is the file you
produced in step #1 above.
For example, if the file ‘myvars’ contains a list of variables
for CASES to output, and if the file ‘idlist’ contains a list of
the case IDs to be output, you could use the following command:
output -i=myvars -ou=mydata idlist
In this example, the ‘output’ program would generate two files:
- An ASCII data file named ‘mydata’.
- A layout file named ‘myvars.lay’ that gives the location of
each variable in the file ‘mydata’.
If you do NOT use the ‘-i’ option, the ‘output’ program will
produce a large data file with many variables you probably do not
want to include in a dataset for analysis. Also, you will not
get a layout file for the variables you want -- rather, you will
have to rely on the comprehensive layout produced by the CASES
‘layout’ program. That layout refers to variables from the ZERO
record as being located in record 0, which will cause problems if
you try to pass those locations on to other programs.
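Before running ‘output’, it can help to confirm that the variable
list really has one name per line, as step 1 requires. The Python
sketch below is a hypothetical pre-check and is not part of CASES
or SDA. It assumes that a variable name contains no embedded
whitespace, and it flags blank lines for review without claiming
that ‘output’ rejects them.

```python
def check_varlist(lines):
    """Return a list of (line_number, problem) tuples for a variable list."""
    problems = []
    seen = {}  # variable name -> line number of its first occurrence
    for number, raw in enumerate(lines, start=1):
        tokens = raw.split()
        if not tokens:
            problems.append((number, "blank line"))
        elif len(tokens) > 1:
            problems.append((number, "more than one name on the line"))
        elif tokens[0] in seen:
            problems.append((number, f"duplicate of line {seen[tokens[0]]}"))
        else:
            seen[tokens[0]] = number
    return problems


# Hypothetical contents of a variable list with two problems.
for number, problem in check_varlist(["age\n", "sex race\n", "age\n"]):
    print(number, problem)
```

To check a real list, pass the open file instead of the sample
lines, for example the file ‘myvars’ used in this document.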
4. RUN THE SDA ‘Q4TODDL’ PROGRAM TO MAKE A DDL FILE
The Q4TODDL program gathers information both from the CASES
instrument and from the layout file, and then it puts the pieces
together in the form of a DDL file. The text of questions and the
category labels are taken from the CASES instrument. The location
of each variable in the data file is taken from the layout file.
The process includes the following steps:
- Change to the directory containing the instrument files,
using a command like this:
cd \xstudy\e-inst
- Create a file of commands (here named ‘q4toddl.txt’)
With any text editor that creates a plain ASCII file, create a
file containing commands that specify the desired options for
Q4TODDL. Any file name will do. An example that will cover most
situations is the following:
#_____(Command file for Q4TODDL -- named ‘q4toddl.txt’)__________
# Note that lines beginning with ‘#’ are interpreted as comments.
# Begin the commands in column 1 of your file.
#
# The first command is a list of .q or .m files used by CASES
QFiles = file1.q file2.q
Varlist = myvars
Layout = myvars.lay
Output = myddl.txt
#__________(End of command file)_________________________________
The above command file will work fine, assuming that the list of
variables you want is in ‘myvars’ and your layout file is
‘myvars.lay’, and you are running the Q4TODDL program in the same
directory as the ‘.q’ files (or the macro-expanded ‘.m’ files).
(For more options and explanations, see the full Q4TODDL
document.)
The DDL output will be written to the file ‘myddl.txt’, which was
the name specified in the command file.
- Run the Q4TODDL program, giving the name of the command file
after the ‘-b’ flag.
q4toddl -b q4toddl.txt
This command will create a DDL file (named ‘myddl.txt’ in the
command file).
- Examine the diagnostic output
Diagnostic and error messages are appended to the file
‘Q4TODDL.MSG’. It is always a good idea to take a look at that
file.
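The command file described above can also be written by a short
script, which is convenient when the same steps are repeated for
several instruments. This Python sketch uses only the four
keywords shown in the example command file (QFiles, Varlist,
Layout, Output). The function name is hypothetical, and the file
names are the ones used as examples in this document.

```python
def build_command_file(qfiles, varlist, layout, output):
    """Return the text of a Q4TODDL command file, one command per line.

    Every command starts in column 1, as the example file requires.
    """
    lines = [
        "# Command file for Q4TODDL (generated by a script)",
        "QFiles = " + " ".join(qfiles),
        "Varlist = " + varlist,
        "Layout = " + layout,
        "Output = " + output,
    ]
    return "\n".join(lines) + "\n"


text = build_command_file(["file1.q", "file2.q"], "myvars", "myvars.lay", "myddl.txt")
print(text, end="")
```

Saving the returned text as a plain ASCII file (for example
‘q4toddl.txt’) gives a file that can be named after the ‘-b’
flag.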
5. ADD STUDY-LEVEL INFORMATION TO THE DDL FILE
Information about the dataset as a whole is not contained either
in the CASES Q-language file or in the layout file. You need to
add that information to the DDL file manually.
The main required elements are:
- A study title: title= ...
- The number of characters in each data record or line:
reclen = xxx
- If the data file contains more than one record per case, you
must indicate that as well:
records/case = n
- The first variable definition after the study-level
information MUST be a variable named ‘CASEID’. You can edit the
location and description of a suitable variable into the blank
field produced by Q4TODDL.
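The value for ‘reclen’ can be read off the data file itself. The
Python sketch below counts the characters on each line and
reports a single length only if all lines agree. It assumes that
every record occupies one line of the ASCII file and that the
line terminator is not counted; both assumptions should be
confirmed against the DDL document. The function names and the
sample records are hypothetical.

```python
from collections import Counter


def record_lengths(lines):
    """Count how many lines have each length, ignoring line terminators."""
    return Counter(len(line.rstrip("\r\n")) for line in lines)


def suggest_reclen(lines):
    """Return the common line length, or raise if the lines disagree."""
    counts = record_lengths(lines)
    if not counts:
        raise ValueError("no records found")
    if len(counts) > 1:
        raise ValueError(f"records differ in length: {dict(counts)}")
    return next(iter(counts))


# Three made-up records of equal length.
print(suggest_reclen(["000112345678\n", "000212345678\n", "000312345678\n"]))
```

For a real file, pass the open data file (for example ‘mydata’)
instead of the sample list. A ValueError about differing lengths
suggests that the file should be inspected before a ‘reclen’
value is entered.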
For a complete description of the required format of a DDL file,
see the DDL document.
6. CHECK THE DDL FILE
After you have run Q4TODDL, added the required study-level
information, and made any other changes you want, you can check
the resulting DDL file for syntax errors. The MAKESDA program
will do this for you, if you use the ‘-c’ option.
For a DDL file named ‘myddl.txt’, you would give the following
command:
makesda -c -l myddl.txt
Some messages will appear on the screen. A fuller report will be
appended to the file ‘MAKESDA.MSG’. Also, note that a list of
all variables processed will be put into the file ‘MAKESDA.LST’.
Once you have a DDL file without errors, you can proceed to
create an SDA dataset and generate a codebook.
SEE ALSO:
DDL       Data Description Language
makesda   Make an SDA dataset from a DDL file and a data file
q4toddl   Convert CASES Q-language files to DDL
xcodebk   Produce a codebook
CSM, UC Berkeley
April 12, 2011